Loop2GPU: Transforming Loops to OpenCL Kernels as a LLVM Pass

Authors

  • Semih Okur
  • Cosmin Radoi
Abstract

Programmers increasingly take advantage of the GPU capabilities of their systems, but programming for the GPU remains hard. We aim to hide some of this complexity from the programmer by having the compiler automatically transform embarrassingly parallel loops into GPU kernels. To this end, we have implemented an LLVM compiler pass that transforms simple loops into OpenCL kernels.
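The abstract does not show what the pass consumes or produces. As a rough illustration of the general idea only, and not the authors' implementation, the sketch below contrasts a sequential loop with independent iterations against a kernel-style form in which the loop body runs once per work-item id. All names (`saxpy_loop`, `saxpy_kernel`, `launch`, `SAXPY_KERNEL_SOURCE`) are hypothetical, the launch is emulated on the CPU, and the OpenCL C string is shown for reference rather than compiled; it assumes the global work size equals the array length.

```python
# Hypothetical illustration only: names such as saxpy_loop, saxpy_kernel and
# launch are made up for this sketch and do not come from the paper.

# OpenCL C kernel that a loop-to-kernel transformation might conceptually emit
# for the loop in saxpy_loop (kept as a string; it is not compiled here).
# Assumption: the kernel is launched with one work-item per array element.
SAXPY_KERNEL_SOURCE = """
__kernel void saxpy(const float a,
                    __global const float *x,
                    __global const float *y,
                    __global float *out)
{
    size_t i = get_global_id(0);   /* replaces the loop induction variable */
    out[i] = a * x[i] + y[i];
}
"""


def saxpy_loop(a, x, y):
    """Original form: a sequential loop whose iterations are independent."""
    out = [0.0] * len(x)
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out


def saxpy_kernel(gid, a, x, y, out):
    """Kernel form: the loop body, with the index supplied as a work-item id."""
    out[gid] = a * x[gid] + y[gid]


def launch(kernel, global_size, *args):
    """Emulate a 1-D launch by running the kernel once per work-item id.

    Ids are visited in reverse to stress that no ordering is assumed.
    """
    for gid in reversed(range(global_size)):
        kernel(gid, *args)


def saxpy_as_kernel(a, x, y):
    """Allocate the output buffer and run the emulated kernel over it."""
    out = [0.0] * len(x)
    launch(saxpy_kernel, len(x), a, x, y, out)
    return out


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    y = [10.0, 20.0, 30.0, 40.0]
    print(saxpy_loop(2.0, x, y))
    print(saxpy_as_kernel(2.0, x, y))
```

Because no iteration reads a value written by another, the two forms give the same result regardless of the order in which work-items run, which is the property that makes such a loop a candidate for this kind of transformation.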

Similar resources

Extending the Capabilities of the Cray Programming Environment with Clang-LLVM Framework Integration

Recent developments in programming for multicore processors and accelerators using C++11, OpenCL and Domain Specific Languages (DSL) have prompted us to look into tools that offer compilers and both static and runtime analysis toolchains to complement the Cray Programming Environment capabilities. In this paper we report our preliminary experiences from using the CLang-LLVM framework on a hybri...

Clad — Automatic Differentiation Using Clang and LLVM

Differentiation is ubiquitous in high energy physics, for instance in minimization algorithms and statistical analysis, in detector alignment and calibration, and in theory. Automatic differentiation (AD) avoids well-known limitations in round-offs and speed, which symbolic and numerical differentiation suffer from, by transforming the source code of functions. We will present how AD can be use...

Enabling Loop Parallelization with Decoupled Software Pipelining in LLVM: Final Report

Software pipelining is an optimization technique used to speed up the execution of loops. A compiler performing the optimization reorders instructions within a loop in order to minimize latencies and avoid wasting instruction cycles. The optimization parallels the out-of-order execution paradigm used by microprocessors, except that instruction reordering is done at the software level, i.e. by t...

The Support of an Experimental OpenCL Compiler on HSA Environments

In recent years, with the increasing computing power and programmability of GPUs, the GPU has taken on an important role as a hardware accelerator. Heterogeneous System Architecture (HSA), announced by the HSA Foundation, is an approach that benefits from the advantages of both CPUs and GPUs. Open Computing Language (OpenCL) is one of the well-known programming frameworks for parallel computing on heterogeneous architecture....

Automatic Generation of Optimized OpenCL Codes Using OCLoptimizer

The eruption of multicore processors and several kinds of accelerators has generalized the interest in parallel programming. The OpenCL standard is very appealing because it provides code portability across most of these platforms. It defines a programming model where a host code requests the execution of kernels in computational devices. Unfortunately, the host API of OpenCL is quite verbose, ...

Publication date: 2012